Overview

Dataset statistics

Number of variables: 24
Number of observations: 1000000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 672.3 MiB
Average record size in memory: 705.0 B
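
The average record size follows from the total memory footprint divided by the number of observations. The short sketch below reproduces that arithmetic; the helper name `average_record_size` is hypothetical and is not part of pandas-profiling.

```python
MIB = 1024 ** 2  # bytes per mebibyte


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Return the average number of bytes per row (hypothetical helper)."""
    return total_mib * MIB / n_rows


# Figures taken from the dataset statistics above.
avg = average_record_size(672.3, 1_000_000)
print(f"{avg:.1f} B")
```

With the reported 672.3 MiB and 1000000 rows this comes out at roughly 705 bytes per record, in line with the value shown in the report.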

Variable types

NUM: 12
CAT: 11
BOOL: 1

Reproduction

Analysis started: 2020-10-16 11:16:57.187956
Analysis finished: 2020-10-16 11:20:51.302344
Duration: 3 minutes and 54.11 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
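
The reported duration is simply the difference between the two timestamps. A minimal check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block above.
started = datetime.strptime("2020-10-16 11:16:57.187956", FMT)
finished = datetime.strptime("2020-10-16 11:20:51.302344", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```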

Warnings

sid has a high cardinality: 2686 distinct values (High cardinality)
sdomain has a high cardinality: 2882 distinct values (High cardinality)
aid has a high cardinality: 3149 distinct values (High cardinality)
adomain has a high cardinality: 199 distinct values (High cardinality)
did has a high cardinality: 150266 distinct values (High cardinality)
dip has a high cardinality: 555865 distinct values (High cardinality)
dmodel has a high cardinality: 5150 distinct values (High cardinality)
hour is highly correlated with df_index (High correlation)
df_index is highly correlated with hour (High correlation)
E is highly correlated with B (High correlation)
B is highly correlated with E (High correlation)
df_index has unique values (Unique)
dtype has 55098 (5.5%) zeros (Zeros)
pos has 719953 (72.0%) zeros (Zeros)
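
Most of the warnings above can be read as simple per-column rules. The sketch below is an illustration only, not pandas-profiling's own code: `column_warnings` and its parameters are hypothetical, and the cardinality threshold of 50 is an assumption that merely happens to be consistent with this report (adomain with 199 distinct values is flagged, acat with 26 is not). Correlation warnings are not covered.

```python
def column_warnings(name, n_rows, n_distinct, n_zeros=0,
                    categorical=False, cardinality_threshold=50):
    """Return warning strings for one column (hypothetical helper)."""
    found = []
    # Assumed rule: categorical columns with many distinct values are flagged.
    if categorical and n_distinct > cardinality_threshold:
        found.append(f"{name} has a high cardinality: {n_distinct} distinct values")
    # A column whose distinct count equals the row count is unique.
    if n_distinct == n_rows:
        found.append(f"{name} has unique values")
    # Assumed rule: numeric columns report how many values are zero.
    if not categorical and n_zeros > 0:
        pct = 100 * n_zeros / n_rows
        found.append(f"{name} has {n_zeros} ({pct:.1f}%) zeros")
    return found


N = 1_000_000
print(column_warnings("adomain", N, 199, categorical=True))
print(column_warnings("acat", N, 26, categorical=True))
print(column_warnings("pos", N, 7, n_zeros=719953))
```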

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct count: 1000000
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20231351.42731
Minimum: 2
Maximum: 40428890
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 2
5-th percentile: 2023981.75
Q1: 10124574
median: 20250724.5
Q3: 30332100.5
95-th percentile: 38401285.75
Maximum: 40428890
Range: 40428888
Interquartile range (IQR): 20207526.5

Descriptive statistics

Standard deviation: 11671953.82
Coefficient of variation (CV): 0.5769240803
Kurtosis: -1.200057968
Mean: 20231351.43
Median Absolute Deviation (MAD): 10102820.5
Skewness: -0.002827670346
Sum: 2.023135143e+13
Variance: 1.362345059e+14
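
Several of the figures above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded values printed for df_index, so small differences in the last digits are expected.

```python
# Values copied from the df_index statistics above.
minimum, maximum = 2, 40428890
q1, q3 = 10124574, 30332100.5
mean, std = 20231351.43, 11671953.82

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range, iqr)
print(f"{cv:.6f}", f"{variance:.6e}")
```
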
Histogram with fixed size bins (bins=10)
Value                   Count    Frequency (%)
11779987                1        < 0.1%
4946998                 1        < 0.1%
17519667                1        < 0.1%
11226160                1        < 0.1%
9411954                 1        < 0.1%
38517806                1        < 0.1%
17531945                1        < 0.1%
23610682                1        < 0.1%
783399                  1        < 0.1%
9174054                 1        < 0.1%
36167796                1        < 0.1%
32128030                1        < 0.1%
21636125                1        < 0.1%
23729179                1        < 0.1%
32119834                1        < 0.1%
2753561                 1        < 0.1%
20824433                1        < 0.1%
10051040                1        < 0.1%
36338710                1        < 0.1%
26603551                1        < 0.1%
2774035                 1        < 0.1%
34227217                1        < 0.1%
12750374                1        < 0.1%
27964431                1        < 0.1%
38444042                1        < 0.1%
Other values (999975)   999975   > 99.9%

Value      Count   Frequency (%)
2          1       < 0.1%
5          1       < 0.1%
11         1       < 0.1%
22         1       < 0.1%
70         1       < 0.1%
77         1       < 0.1%
88         1       < 0.1%
99         1       < 0.1%
122        1       < 0.1%
141        1       < 0.1%

Value      Count   Frequency (%)
40428890   1       < 0.1%
40428883   1       < 0.1%
40428881   1       < 0.1%
40428877   1       < 0.1%
40428851   1       < 0.1%
40428786   1       < 0.1%
40428752   1       < 0.1%
40428751   1       < 0.1%
40428726   1       < 0.1%
40428591   1       < 0.1%

like
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value   Count    Frequency (%)
0       830111   83.0%
1       169889   17.0%

hour
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 240
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19122558.732911
Minimum: 19122100
Maximum: 19123023
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 19122100
5-th percentile: 19122109
Q1: 19122304
median: 19122602
Q3: 19122814
95-th percentile: 19123012
Maximum: 19123023
Range: 923
Interquartile range (IQR): 510

Descriptive statistics

Standard deviation: 296.7519741
Coefficient of variation (CV): 1.551842398e-05
Kurtosis: -1.336231511
Mean: 19122558.73
Median Absolute Deviation (MAD): 287
Skewness: -0.00760012114
Sum: 1.912255873e+13
Variance: 88061.73411
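
The report treats hour as a plain number, but its values look like timestamps packed as YYMMDDHH. That reading is an assumption and is not stated anywhere in the report. The sketch below decodes the reported minimum and maximum under that assumption and counts the hourly slots between them; `parse_hour` is a hypothetical helper.

```python
from datetime import datetime


def parse_hour(code: int) -> datetime:
    """Decode an integer assumed to be packed as YYMMDDHH (hypothetical helper)."""
    return datetime.strptime(str(code), "%y%m%d%H")


# Minimum and maximum of hour as reported above.
first = parse_hour(19122100)
last = parse_hour(19123023)

# Number of hourly slots from first to last, both ends included.
span_hours = int((last - first).total_seconds() // 3600) + 1
print(first, last, span_hours)
```

Under this reading the slot count equals the distinct count of 240 reported for hour, which would mean every hour over ten consecutive days is present.
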
Histogram with fixed size bins (bins=10)
Value                Count    Frequency (%)
19122209             10988    1.1%
19122210             10804    1.1%
19122813             10632    1.1%
19122212             10151    1.0%
19122814             9739     1.0%
19122211             9453     0.9%
19123004             8529     0.9%
19122809             8188     0.8%
19122208             7900     0.8%
19122213             7850     0.8%
19122808             7230     0.7%
19122205             7109     0.7%
19122815             7066     0.7%
19122206             7004     0.7%
19122816             6938     0.7%
19122817             6893     0.7%
19122304             6796     0.7%
19122105             6781     0.7%
19122812             6647     0.7%
19122104             6549     0.7%
19122417             6507     0.7%
19123014             6492     0.6%
19122811             6464     0.6%
19122810             6399     0.6%
19123005             6182     0.6%
Other values (215)   804709   80.5%

Value      Count   Frequency (%)
19122100   2935    0.3%
19122101   3444    0.3%
19122102   5059    0.5%
19122103   4852    0.5%
19122104   6549    0.7%
19122105   6781    0.7%
19122106   5888    0.6%
19122107   5094    0.5%
19122108   5149    0.5%
19122109   5658    0.6%

Value      Count   Frequency (%)
19123023   1954    0.2%
19123022   2528    0.3%
19123021   2769    0.3%
19123020   2743    0.3%
19123019   3310    0.3%
19123018   3901    0.4%
19123017   4401    0.4%
19123016   5232    0.5%
19123015   5859    0.6%
19123014   6492    0.6%

sid
Categorical

HIGH CARDINALITY

Distinct count: 2686
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value                 Count    Frequency (%)
85f751fd              360971   36.1%
1fbe01fe              160356   16.0%
e151e245              65115    6.5%
d9750ee7              23541    2.4%
5b08c53b              22730    2.3%
5b4d2eda              19244    1.9%
856e6d3f              19120    1.9%
a7853007              11389    1.1%
b7e9786d              9128     0.9%
5ee41ff2              8658     0.9%
6399eda6              8657     0.9%
5bcf81a2              8286     0.8%
6256f5b4              7848     0.8%
57ef2c87              7622     0.8%
17caea14              6846     0.7%
83a0ad1a              6686     0.7%
57fe1b20              6666     0.7%
0a742914              6665     0.7%
e4d8dd7b              6437     0.6%
e8f79e60              6352     0.6%
d6137915              5769     0.6%
6c5b482c              4839     0.5%
12fb4121              4682     0.5%
93eaba74              4504     0.5%
e5c60a05              4485     0.4%
Other values (2661)   203404   20.3%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
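
As an illustration of how such a per-category breakdown can be obtained, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the code pandas-profiling uses; `category_counts` and `CATEGORY_NAMES` are hypothetical names, and the two sample strings are the two most frequent sid values from the table above.

```python
import unicodedata
from collections import Counter

# Readable names for the two general-category codes that occur in hex strings.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Ll": "Lowercase Letter"}


def category_counts(values):
    """Count characters per Unicode general category (hypothetical helper)."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["85f751fd", "1fbe01fe"]
print(category_counts(sample))
```

Scripts and blocks are not exposed by `unicodedata`, so this sketch covers the category property only.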

Most occurring characters

ValueCountFrequency (%) 
f121869915.2%
 
5113429814.2%
 
199477712.4%
 
e7437839.3%
 
76253837.8%
 
d5836467.3%
 
85507536.9%
 
b3801984.8%
 
03541134.4%
 
42511313.1%
 
22413313.0%
 
a2063992.6%
 
61959102.4%
 
91794222.2%
 
31738692.2%
 
c1662882.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number470098758.8%
 
Lowercase Letter329901341.2%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
5113429824.1%
 
199477721.2%
 
762538313.3%
 
855075311.7%
 
03541137.5%
 
42511315.3%
 
22413315.1%
 
61959104.2%
 
91794223.8%
 
31738693.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
f121869936.9%
 
e74378322.5%
 
d58364617.7%
 
b38019811.5%
 
a2063996.3%
 
c1662885.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common470098758.8%
 
Latin329901341.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
5113429824.1%
 
199477721.2%
 
762538313.3%
 
855075311.7%
 
03541137.5%
 
42511315.3%
 
22413315.1%
 
61959104.2%
 
91794223.8%
 
31738693.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
f121869936.9%
 
e74378322.5%
 
d58364617.7%
 
b38019811.5%
 
a2063996.3%
 
c1662885.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
f121869915.2%
 
5113429814.2%
 
199477712.4%
 
e7437839.3%
 
76253837.8%
 
d5836467.3%
 
85507536.9%
 
b3801984.8%
 
03541134.4%
 
42511313.1%
 
22413313.0%
 
a2063992.6%
 
61959102.4%
 
91794222.2%
 
31738692.2%
 
c1662882.1%
 

sdomain
Categorical

HIGH CARDINALITY

Distinct count: 2882
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value                 Count    Frequency (%)
c4e18dd6              374079   37.4%
f3845767              160356   16.0%
7e091613              82047    8.2%
7687a86e              32073    3.2%
98572c79              24372    2.4%
16a36ef3              21382    2.1%
58a89a43              19120    1.9%
b12b9f85              9259     0.9%
9d54950b              9206     0.9%
17d996e6              8774     0.9%
968765cd              8657     0.9%
28f93029              7848     0.8%
bd6d812f              7622     0.8%
d262cf1e              7219     0.7%
0dde25ec              6846     0.7%
5b626596              6690     0.7%
5c9ae867              6686     0.7%
510bd839              6665     0.7%
a17bde68              6437     0.6%
c4342784              6352     0.6%
6b59f079              5891     0.6%
bb1ef334              5769     0.6%
7256c623              5461     0.5%
a434fa42              5049     0.5%
3f2f3819              4094     0.4%
Other values (2857)   162046   16.2%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
d90293811.3%
 
689102811.1%
 
881212610.2%
 
16940398.7%
 
46777788.5%
 
76420498.0%
 
e6337097.9%
 
c5373476.7%
 
34394385.5%
 
53648824.6%
 
93374744.2%
 
f3117383.9%
 
22021762.5%
 
01942942.4%
 
a1897472.4%
 
b1692372.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number525528465.7%
 
Lowercase Letter274471634.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
d90293832.9%
 
e63370923.1%
 
c53734719.6%
 
f31173811.4%
 
a1897476.9%
 
b1692376.2%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
689102817.0%
 
881212615.5%
 
169403913.2%
 
467777812.9%
 
764204912.2%
 
34394388.4%
 
53648826.9%
 
93374746.4%
 
22021763.8%
 
01942943.7%
 

Most occurring scripts

ValueCountFrequency (%) 
Common525528465.7%
 
Latin274471634.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
d90293832.9%
 
e63370923.1%
 
c53734719.6%
 
f31173811.4%
 
a1897476.9%
 
b1692376.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
689102817.0%
 
881212615.5%
 
169403913.2%
 
467777812.9%
 
764204912.2%
 
34394388.4%
 
53648826.9%
 
93374746.4%
 
22021763.8%
 
01942943.7%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
d90293811.3%
 
689102811.1%
 
881212610.2%
 
16940398.7%
 
46777788.5%
 
76420498.0%
 
e6337097.9%
 
c5373476.7%
 
34394385.5%
 
53648824.6%
 
93374744.2%
 
f3117383.9%
 
22021762.5%
 
01942942.4%
 
a1897472.4%
 
b1692372.1%
 

scat
Categorical

Distinct count: 20
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value      Count    Frequency (%)
50e219e0   409036   40.9%
f028772b   313005   31.3%
28905ebd   182134   18.2%
3e814130   75650    7.6%
f66779e6   6288     0.6%
75fa27f6   4037     0.4%
335d28a8   3405     0.3%
76b2941d   2650     0.3%
c0dd3be3   1045     0.1%
72722551   732      0.1%
70fb0e29   624      0.1%
dedf689d   584      0.1%
0569f928   394      < 0.1%
8fd0aea4   206      < 0.1%
a818d37a   87       < 0.1%
42a36e14   61       < 0.1%
e787de0e   24       < 0.1%
bcf865d9   21       < 0.1%
5378d028   14       < 0.1%
9ccfa2ea   3        < 0.1%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
0139179217.4%
 
2123056415.4%
 
e108473913.6%
 
76515478.1%
 
96021287.5%
 
56005057.5%
 
85790307.2%
 
15638667.0%
 
b4994796.2%
 
f3291994.1%
 
d1923832.4%
 
31603622.0%
 
4786281.0%
 
6266110.3%
 
a80950.1%
 
c1072< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number588503373.6%
 
Lowercase Letter211496726.4%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0139179223.6%
 
2123056420.9%
 
765154711.1%
 
960212810.2%
 
560050510.2%
 
85790309.8%
 
15638669.6%
 
31603622.7%
 
4786281.3%
 
6266110.5%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e108473951.3%
 
b49947923.6%
 
f32919915.6%
 
d1923839.1%
 
a80950.4%
 
c10720.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Common588503373.6%
 
Latin211496726.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
0139179223.6%
 
2123056420.9%
 
765154711.1%
 
960212810.2%
 
560050510.2%
 
85790309.8%
 
15638669.6%
 
31603622.7%
 
4786281.3%
 
6266110.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e108473951.3%
 
b49947923.6%
 
f32919915.6%
 
d1923839.1%
 
a80950.4%
 
c10720.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
0139179217.4%
 
2123056415.4%
 
e108473913.6%
 
76515478.1%
 
96021287.5%
 
56005057.5%
 
85790307.2%
 
15638667.0%
 
b4994796.2%
 
f3291994.1%
 
d1923832.4%
 
31603622.0%
 
4786281.0%
 
6266110.3%
 
a80950.1%
 
c1072< 0.1%
 

aid
Categorical

HIGH CARDINALITY

Distinct count: 3149
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value                 Count    Frequency (%)
ecad2386              639029   63.9%
92f5800b              38510    3.9%
e2fcccd2              27831    2.8%
febd1138              18987    1.9%
9c13b419              18782    1.9%
7358e05e              15125    1.5%
a5184c22              11988    1.2%
d36838b1              11299    1.1%
685d1c4c              10085    1.0%
54c5d545              9924     1.0%
03528b27              7950     0.8%
f0d41ff1              7219     0.7%
e2a1ca37              6917     0.7%
e9739828              6900     0.7%
51cedd4e              5966     0.6%
66f5e02e              5704     0.6%
03a08c3f              5303     0.5%
98fed791              5286     0.5%
73206397              4961     0.5%
f53417e1              4879     0.5%
e96773f0              4403     0.4%
ce183bbd              3706     0.4%
be7c618d              2946     0.3%
f888bf4c              2588     0.3%
1dc72b4d              2539     0.3%
Other values (3124)   121173   12.1%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
c87140910.9%
 
886139210.8%
 
285433110.7%
 
383232210.4%
 
e83230810.4%
 
d81899410.2%
 
67431519.3%
 
a7371519.2%
 
f2143622.7%
 
52095962.6%
 
12024232.5%
 
01902382.4%
 
91732262.2%
 
b1700832.1%
 
41538411.9%
 
71351731.7%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number435569354.4%
 
Lowercase Letter364430745.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
c87140923.9%
 
e83230822.8%
 
d81899422.5%
 
a73715120.2%
 
f2143625.9%
 
b1700834.7%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
886139219.8%
 
285433119.6%
 
383232219.1%
 
674315117.1%
 
52095964.8%
 
12024234.6%
 
01902384.4%
 
91732264.0%
 
41538413.5%
 
71351733.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Common435569354.4%
 
Latin364430745.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
c87140923.9%
 
e83230822.8%
 
d81899422.5%
 
a73715120.2%
 
f2143625.9%
 
b1700834.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
886139219.8%
 
285433119.6%
 
383232219.1%
 
674315117.1%
 
52095964.8%
 
12024234.6%
 
01902384.4%
 
91732264.0%
 
41538413.5%
 
71351733.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
c87140910.9%
 
886139210.8%
 
285433110.7%
 
383232210.4%
 
e83230810.4%
 
d81899410.2%
 
67431519.3%
 
a7371519.2%
 
f2143622.7%
 
52095962.6%
 
12024232.5%
 
01902382.4%
 
91732262.2%
 
b1700832.1%
 
41538411.9%
 
71351731.7%
 

adomain
Categorical

HIGH CARDINALITY

Distinct count: 199
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value                Count    Frequency (%)
7801e8d9             673785   67.4%
2347f47a             129463   12.9%
ae637522             46658    4.7%
5c5a694b             27835    2.8%
82e27996             18988    1.9%
d9b5648e             17663    1.8%
0e8616ad             16272    1.6%
b9528b13             15892    1.6%
b8d325c3             12991    1.3%
aefc06bd             7463     0.7%
df32afa9             7070     0.7%
33da2e74             6422     0.6%
6f7ca2ba             5705     0.6%
5b9c592b             2588     0.3%
885c7f3f             1731     0.2%
5c620f04             1508     0.2%
45a51db4             1425     0.1%
b5f3b24a             1173     0.1%
813f3323             598      0.1%
0654b444             597      0.1%
ad63ec9b             433      < 0.1%
c6824def             387      < 0.1%
a8b0bf20             343      < 0.1%
15ec7f39             319      < 0.1%
99b4c806             269      < 0.1%
Other values (174)   2422     0.2%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
8143570617.9%
 
7101396712.7%
 
e7899669.9%
 
97875369.8%
 
d7450889.3%
 
17092248.9%
 
07029018.8%
 
43211744.0%
 
23166904.0%
 
a2643363.3%
 
32454593.1%
 
f1662252.1%
 
51631782.0%
 
61610642.0%
 
b1153381.4%
 
c621480.8%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number585689973.2%
 
Lowercase Letter214310126.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e78996636.9%
 
d74508834.8%
 
a26433612.3%
 
f1662257.8%
 
b1153385.4%
 
c621482.9%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
8143570624.5%
 
7101396717.3%
 
978753613.4%
 
170922412.1%
 
070290112.0%
 
43211745.5%
 
23166905.4%
 
32454594.2%
 
51631782.8%
 
61610642.7%
 

Most occurring scripts

ValueCountFrequency (%) 
Common585689973.2%
 
Latin214310126.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e78996636.9%
 
d74508834.8%
 
a26433612.3%
 
f1662257.8%
 
b1153385.4%
 
c621482.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
8143570624.5%
 
7101396717.3%
 
978753613.4%
 
170922412.1%
 
070290112.0%
 
43211745.5%
 
23166905.4%
 
32454594.2%
 
51631782.8%
 
61610642.7%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
8143570617.9%
 
7101396712.7%
 
e7899669.9%
 
97875369.8%
 
d7450889.3%
 
17092248.9%
 
07029018.8%
 
43211744.0%
 
23166904.0%
 
a2643363.3%
 
32454593.1%
 
f1662252.1%
 
51631782.0%
 
61610642.0%
 
b1153381.4%
 
c621480.8%
 

acat
Categorical

Distinct count: 26
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value      Count    Frequency (%)
07d7df22   647371   64.7%
0f2161f8   236729   23.7%
cef3e649   42540    4.3%
8ded1f7a   36074    3.6%
f95efa07   28188    2.8%
d1327cf5   3083     0.3%
dc97ec06   1429     0.1%
09481d60   1390     0.1%
75d80bbe   1019     0.1%
fc6fa53d   586      0.1%
4ce2e9fc   496      < 0.1%
879c24eb   325      < 0.1%
a3c42688   278      < 0.1%
4681bb9d   172      < 0.1%
0f9a328c   111      < 0.1%
2281a340   54       < 0.1%
a86a3e89   53       < 0.1%
8df2e842   45       < 0.1%
79f0b860   16       < 0.1%
a7fd01ec   10       < 0.1%
2fc4f2aa   9        < 0.1%
7113d72a   8        < 0.1%
18b1e0be   6        < 0.1%
0bfbc358   5        < 0.1%
5326cf99   2        < 0.1%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
2153599219.2%
 
d137463417.2%
 
7136490217.1%
 
f126077715.8%
 
091773511.5%
 
15142696.4%
 
62831953.5%
 
82766543.5%
 
e1532271.9%
 
9747240.9%
 
a654330.8%
 
c507990.6%
 
3467200.6%
 
4453090.6%
 
5328840.4%
 
b2746< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number509238463.7%
 
Lowercase Letter290761636.3%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
2153599230.2%
 
7136490226.8%
 
091773518.0%
 
151426910.1%
 
62831955.6%
 
82766545.4%
 
9747241.5%
 
3467200.9%
 
4453090.9%
 
5328840.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
d137463447.3%
 
f126077743.4%
 
e1532275.3%
 
a654332.3%
 
c507991.7%
 
b27460.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Common509238463.7%
 
Latin290761636.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
2153599230.2%
 
7136490226.8%
 
091773518.0%
 
151426910.1%
 
62831955.6%
 
82766545.4%
 
9747241.5%
 
3467200.9%
 
4453090.9%
 
5328840.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
d137463447.3%
 
f126077743.4%
 
e1532275.3%
 
a654332.3%
 
c507991.7%
 
b27460.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
2153599219.2%
 
d137463417.2%
 
7136490217.1%
 
f126077715.8%
 
091773511.5%
 
15142696.4%
 
62831953.5%
 
82766543.5%
 
e1532271.9%
 
9747240.9%
 
a654330.8%
 
c507990.6%
 
3467200.6%
 
4453090.6%
 
5328840.4%
 
b2746< 0.1%
 

did
Categorical

HIGH CARDINALITY

Distinct count: 150266
Unique (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value                   Count    Frequency (%)
a99f214a                825245   82.5%
0f7c61dc                508      0.1%
c357dbff                467      < 0.1%
936e92fb                353      < 0.1%
afeffc18                234      < 0.1%
28dc8687                121      < 0.1%
987552d1                109      < 0.1%
cef4c8cc                106      < 0.1%
d857ffbb                99       < 0.1%
3cdb4052                94       < 0.1%
b09da1c4                90       < 0.1%
03559b29                66       < 0.1%
02da5312                57       < 0.1%
096a6f32                41       < 0.1%
bbcf14e4                39       < 0.1%
d2e4c0ab                38       < 0.1%
f1d9c744                37       < 0.1%
eec6d022                34       < 0.1%
c35f5168                33       < 0.1%
e8343327                30       < 0.1%
9af87478                28       < 0.1%
f58a1c3b                26       < 0.1%
0a04637d                26       < 0.1%
4e9e9550                25       < 0.1%
abab24a7                25       < 0.1%
Other values (150241)   172069   17.2%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
9173729221.7%
 
a173689221.7%
 
f91431311.4%
 
491244611.4%
 
291235211.4%
 
191224911.4%
 
c884271.1%
 
3878081.1%
 
7877581.1%
 
5877001.1%
 
e874901.1%
 
d874771.1%
 
b874641.1%
 
6869931.1%
 
0869321.1%
 
8864071.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number499793762.5%
 
Lowercase Letter300206337.5%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
9173729234.8%
 
491244618.3%
 
291235218.3%
 
191224918.3%
 
3878081.8%
 
7877581.8%
 
5877001.8%
 
6869931.7%
 
0869321.7%
 
8864071.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a173689257.9%
 
f91431330.5%
 
c884272.9%
 
e874902.9%
 
d874772.9%
 
b874642.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Common499793762.5%
 
Latin300206337.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
9173729234.8%
 
491244618.3%
 
291235218.3%
 
191224918.3%
 
3878081.8%
 
7877581.8%
 
5877001.8%
 
6869931.7%
 
0869321.7%
 
8864071.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a173689257.9%
 
f91431330.5%
 
c884272.9%
 
e874902.9%
 
d874772.9%
 
b874642.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
9173729221.7%
 
a173689221.7%
 
f91431311.4%
 
491244611.4%
 
291235211.4%
 
191224911.4%
 
c884271.1%
 
3878081.1%
 
7877581.1%
 
5877001.1%
 
e874901.1%
 
d874771.1%
 
b874641.1%
 
6869931.1%
 
0869321.1%
 
8864071.1%
 

dip
Categorical

HIGH CARDINALITY

Distinct count: 555865
Unique (%): 55.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value                   Count    Frequency (%)
6b9769f2                5085     0.5%
431b3174                3463     0.3%
2f323f36                2198     0.2%
930ec31d                2166     0.2%
af9205f9                2163     0.2%
d90a7774                2105     0.2%
6394f6f6                2088     0.2%
af62faf4                2083     0.2%
009a7861                2030     0.2%
285aa37d                2012     0.2%
c6563308                1793     0.2%
0489ce3f                1756     0.2%
ddd2926e                1754     0.2%
a8536f3a                1731     0.2%
ceffea69                1727     0.2%
488a9a3e                1719     0.2%
1cf29716                1716     0.2%
8a014cbb                1688     0.2%
57cd4006                1684     0.2%
75bb1b58                1671     0.2%
9b1fe278                1591     0.2%
07875ea4                933      0.1%
7ed30f6c                921      0.1%
b0070d9a                895      0.1%
ac77b71a                868      0.1%
Other values (555840)   952160   95.2%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
f5183046.5%
 
35122246.4%
 
95121356.4%
 
65115286.4%
 
a5078546.3%
 
75025646.3%
 
44990496.2%
 
04982326.2%
 
14974146.2%
 
24963926.2%
 
b4950256.2%
 
84916066.1%
 
d4914756.1%
 
e4913766.1%
 
c4889016.1%
 
54859216.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number500706562.6%
 
Lowercase Letter299293537.4%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
351222410.2%
 
951213510.2%
 
651152810.2%
 
750256410.0%
 
449904910.0%
 
049823210.0%
 
14974149.9%
 
24963929.9%
 
84916069.8%
 
54859219.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
f51830417.3%
 
a50785417.0%
 
b49502516.5%
 
d49147516.4%
 
e49137616.4%
 
c48890116.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common500706562.6%
 
Latin299293537.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
351222410.2%
 
951213510.2%
 
651152810.2%
 
750256410.0%
 
449904910.0%
 
049823210.0%
 
14974149.9%
 
24963929.9%
 
84916069.8%
 
54859219.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
f51830417.3%
 
a50785417.0%
 
b49502516.5%
 
d49147516.4%
 
e49137616.4%
 
c48890116.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
f5183046.5%
 
35122246.4%
 
95121356.4%
 
65115286.4%
 
a5078546.3%
 
75025646.3%
 
44990496.2%
 
04982326.2%
 
14974146.2%
 
24963926.2%
 
b4950256.2%
 
84916066.1%
 
d4914756.1%
 
e4913766.1%
 
c4889016.1%
 
54859216.1%
 

dmodel
Categorical

HIGH CARDINALITY

Distinct count: 5150
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value                 Count    Frequency (%)
8a4875bd              60497    6.0%
1f0bc64f              35049    3.5%
d787e91b              34630    3.5%
76dc4769              18999    1.9%
be6db1d7              18212    1.8%
a0f5f879              16126    1.6%
4ea23a13              16043    1.6%
7abbbd5c              15640    1.6%
ecb851b2              15097    1.5%
d4897fef              11967    1.2%
5096d134              11739    1.2%
711ee120              11036    1.1%
1ccc7835              10544    1.1%
e1eae715              10384    1.0%
c6263d8a              9752     1.0%
84ebbcd4              9536     1.0%
be74e6fe              9440     0.9%
3bd9e8e7              8914     0.9%
0eb711ec              8889     0.9%
0bcabeaf              8881     0.9%
f07e20f8              8874     0.9%
3bb1ddd7              8871     0.9%
981edffc              8809     0.9%
779d90c2              8697     0.9%
36b67a2a              8585     0.9%
Other values (5125)   614789   61.5%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
b6446338.1%
 
76035667.5%
 
e5902147.4%
 
d5650367.1%
 
85493226.9%
 
15447986.8%
 
45276286.6%
 
a4961276.2%
 
f4806016.0%
 
64719165.9%
 
54624595.8%
 
c4585785.7%
 
94391345.5%
 
03955104.9%
 
23853754.8%
 
33851034.8%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number476481159.6%
 
Lowercase Letter323518940.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
b64463319.9%
 
e59021418.2%
 
d56503617.5%
 
a49612715.3%
 
f48060114.9%
 
c45857814.2%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
760356612.7%
 
854932211.5%
 
154479811.4%
 
452762811.1%
 
64719169.9%
 
54624599.7%
 
94391349.2%
 
03955108.3%
 
23853758.1%
 
33851038.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Common476481159.6%
 
Latin323518940.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
b64463319.9%
 
e59021418.2%
 
d56503617.5%
 
a49612715.3%
 
f48060114.9%
 
c45857814.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
760356612.7%
 
854932211.5%
 
154479811.4%
 
452762811.1%
 
64719169.9%
 
54624599.7%
 
94391349.2%
 
03955108.3%
 
23853758.1%
 
33851038.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
b6446338.1%
 
76035667.5%
 
e5902147.4%
 
d5650367.1%
 
85493226.9%
 
15447986.8%
 
45276286.6%
 
a4961276.2%
 
f4806016.0%
 
64719165.9%
 
54624595.8%
 
c4585785.7%
 
94391345.5%
 
03955104.9%
 
23853754.8%
 
33851034.8%
 

dtype
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.014972
Minimum: 0
Maximum: 5
Zeros: 55098
Zeros (%): 5.5%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 1
95-th percentile: 1
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5272325077
Coefficient of variation (CV): 0.5194552241
Kurtosis: 27.85927964
Mean: 1.014972
Median Absolute Deviation (MAD): 0
Skewness: 4.457522816
Sum: 1014972
Variance: 0.2779741172
Histogram with fixed size bins (bins=10)
Value   Count    Frequency (%)
1       922619   92.3%
0       55098    5.5%
4       19059    1.9%
5       3223     0.3%
2       1        < 0.1%

Value   Count    Frequency (%)
0       55098    5.5%
1       922619   92.3%
2       1        < 0.1%
4       19059    1.9%
5       3223     0.3%

Value   Count    Frequency (%)
5       3223     0.3%
4       19059    1.9%
2       1        < 0.1%
1       922619   92.3%
0       55098    5.5%

dconn
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value   Count    Frequency (%)
0       863420   86.3%
2       81622    8.2%
3       53894    5.4%
5       1064     0.1%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
086342086.3%
 
2816228.2%
 
3538945.4%
 
510640.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number1000000100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
086342086.3%
 
2816228.2%
 
3538945.4%
 
510640.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Common1000000100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
086342086.3%
 
2816228.2%
 
3538945.4%
 
510640.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1000000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
086342086.3%
 
2816228.2%
 
3538945.4%
 
510640.1%
 

pos
Real number (ℝ≥0)

ZEROS

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.287912
Minimum: 0
Maximum: 7
Zeros: 719953
Zeros (%): 72.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.5051998966
Coefficient of variation (CV): 1.754702467
Kurtosis: 32.64191603
Mean: 0.287912
Median Absolute Deviation (MAD): 0
Skewness: 3.322202321
Sum: 287912
Variance: 0.2552269355
Histogram with fixed size bins (bins=10)
Value   Count    Frequency (%)
0       719953   72.0%
1       278289   27.8%
7       1051     0.1%
2       334      < 0.1%
4       193      < 0.1%
5       143      < 0.1%
3       37       < 0.1%

Value   Count    Frequency (%)
0       719953   72.0%
1       278289   27.8%
2       334      < 0.1%
3       37       < 0.1%
4       193      < 0.1%
5       143      < 0.1%
7       1051     0.1%

Value   Count    Frequency (%)
7       1051     0.1%
5       143      < 0.1%
4       193      < 0.1%
3       37       < 0.1%
2       334      < 0.1%
1       278289   27.8%
0       719953   72.0%

A
Real number (ℝ≥0)

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1004.966775
Minimum: 1001
Maximum: 1012
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 1001
5-th percentile: 1002
Q1: 1005
median: 1005
Q3: 1005
95-th percentile: 1005
Maximum: 1012
Range: 11
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.093227467
Coefficient of variation (CV): 0.001087824487
Kurtosis: 14.78001598
Mean: 1004.966775
Median Absolute Deviation (MAD): 0
Skewness: 1.806447956
Sum: 1004966775
Variance: 1.195146295
Histogram with fixed size bins (bins=10)
Value   Count    Frequency (%)
1005    918627   91.9%
1002    55098    5.5%
1010    22282    2.2%
1012    2758     0.3%
1007    882      0.1%
1001    210      < 0.1%
1008    143      < 0.1%

Value   Count    Frequency (%)
1001    210      < 0.1%
1002    55098    5.5%
1005    918627   91.9%
1007    882      0.1%
1008    143      < 0.1%
1010    22282    2.2%
1012    2758     0.3%

Value   Count    Frequency (%)
1012    2758     0.3%
1010    22282    2.2%
1008    143      < 0.1%
1007    882      0.1%
1005    918627   91.9%
1002    55098    5.5%
1001    210      < 0.1%

B
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 2246
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18848.445596
Minimum: 375
Maximum: 24044
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 375
5-th percentile: 6393
Q1: 16920
median: 20346
Q3: 21894
95-th percentile: 23561
Maximum: 24044
Range: 23669
Interquartile range (IQR): 4974

Descriptive statistics

Standard deviation: 4951.645567
Coefficient of variation (CV): 0.2627084309
Kurtosis: 3.482762077
Mean: 18848.4456
Median Absolute Deviation (MAD): 2336
Skewness: -1.887985203
Sum: 1.88484456e+10
Variance: 24518793.82
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
4687 | 23384 | 2.3%
21611 | 22681 | 2.3%
21189 | 18980 | 1.9%
21191 | 18954 | 1.9%
19771 | 18281 | 1.8%
19772 | 18064 | 1.8%
16208 | 16425 | 1.6%
20108 | 14410 | 1.4%
8330 | 13735 | 1.4%
19950 | 13080 | 1.3%
15701 | 12781 | 1.3%
15703 | 12686 | 1.3%
15705 | 12520 | 1.3%
15707 | 12387 | 1.2%
15699 | 12363 | 1.2%
15708 | 12304 | 1.2%
15704 | 12194 | 1.2%
15702 | 11907 | 1.2%
15706 | 11707 | 1.2%
16615 | 10903 | 1.1%
23804 | 10166 | 1.0%
21767 | 9279 | 0.9%
21768 | 9081 | 0.9%
22676 | 8552 | 0.9%
17239 | 8359 | 0.8%
Other values (2221) | 654817 | 65.5%

Value | Count | Frequency (%)
375 | 2065 | 0.2%
376 | 5 | < 0.1%
377 | 1987 | 0.2%
380 | 1809 | 0.2%
381 | 94 | < 0.1%
451 | 22 | < 0.1%
452 | 1451 | 0.1%
454 | 1453 | 0.1%
455 | 1 | < 0.1%
456 | 1505 | 0.2%

Value | Count | Frequency (%)
24044 | 2 | < 0.1%
24043 | 41 | < 0.1%
24042 | 22 | < 0.1%
24041 | 132 | < 0.1%
24040 | 131 | < 0.1%
24037 | 47 | < 0.1%
24036 | 616 | 0.1%
24035 | 624 | 0.1%
24034 | 1523 | 0.2%
24033 | 101 | < 0.1%

C
Real number (ℝ≥0)

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 318.892852
Minimum: 120
Maximum: 1024
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 120
5-th percentile: 300
Q1: 320
median: 320
Q3: 320
95-th percentile: 320
Maximum: 1024
Range: 904
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 21.31636755
Coefficient of variation (CV): 0.06684492117
Kurtosis: 342.6409646
Mean: 318.892852
Median Absolute Deviation (MAD): 0
Skewness: 14.99252409
Sum: 318892852
Variance: 454.3875257
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
320 | 932772 | 93.3%
300 | 57835 | 5.8%
216 | 7309 | 0.7%
728 | 1851 | 0.2%
120 | 82 | < 0.1%
1024 | 70 | < 0.1%
480 | 51 | < 0.1%
768 | 30 | < 0.1%

Value | Count | Frequency (%)
120 | 82 | < 0.1%
216 | 7309 | 0.7%
300 | 57835 | 5.8%
320 | 932772 | 93.3%
480 | 51 | < 0.1%
728 | 1851 | 0.2%
768 | 30 | < 0.1%
1024 | 70 | < 0.1%

Value | Count | Frequency (%)
1024 | 70 | < 0.1%
768 | 30 | < 0.1%
728 | 1851 | 0.2%
480 | 51 | < 0.1%
320 | 932772 | 93.3%
300 | 57835 | 5.8%
216 | 7309 | 0.7%
120 | 82 | < 0.1%

D
Real number (ℝ≥0)

Distinct count: 9
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60.096674
Minimum: 20
Maximum: 1024
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 20
5-th percentile: 50
Q1: 50
median: 50
Q3: 50
95-th percentile: 50
Maximum: 1024
Range: 1004
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 47.20946226
Coefficient of variation (CV): 0.7855586527
Kurtosis: 33.40002178
Mean: 60.096674
Median Absolute Deviation (MAD): 0
Skewness: 5.186969055
Sum: 60096674
Variance: 2228.733327
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
50 | 943356 | 94.3%
250 | 44712 | 4.5%
36 | 7309 | 0.7%
480 | 2539 | 0.3%
90 | 1851 | 0.2%
20 | 82 | < 0.1%
768 | 70 | < 0.1%
320 | 51 | < 0.1%
1024 | 30 | < 0.1%

Value | Count | Frequency (%)
20 | 82 | < 0.1%
36 | 7309 | 0.7%
50 | 943356 | 94.3%
90 | 1851 | 0.2%
250 | 44712 | 4.5%
320 | 51 | < 0.1%
480 | 2539 | 0.3%
768 | 70 | < 0.1%
1024 | 30 | < 0.1%

Value | Count | Frequency (%)
1024 | 30 | < 0.1%
768 | 70 | < 0.1%
480 | 2539 | 0.3%
320 | 51 | < 0.1%
250 | 44712 | 4.5%
90 | 1851 | 0.2%
50 | 943356 | 94.3%
36 | 7309 | 0.7%
20 | 82 | < 0.1%

E
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 416
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2113.1171
Minimum: 112
Maximum: 2757
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 112
5-th percentile: 547
Q1: 1863
median: 2323
Q3: 2526
95-th percentile: 2691
Maximum: 2757
Range: 2645
Interquartile range (IQR): 663

Descriptive statistics

Standard deviation: 608.8224622
Coefficient of variation (CV): 0.2881158182
Kurtosis: 2.269419741
Mean: 2113.1171
Median Absolute Deviation (MAD): 301
Skewness: -1.639087222
Sum: 2113117100
Variance: 370664.7905
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
1722 | 111250 | 11.1%
2424 | 37935 | 3.8%
2227 | 36677 | 3.7%
1800 | 29581 | 3.0%
423 | 23384 | 2.3%
2480 | 22957 | 2.3%
2502 | 21193 | 2.1%
2528 | 20539 | 2.1%
2506 | 19690 | 2.0%
2374 | 18531 | 1.9%
2545 | 17650 | 1.8%
1872 | 17200 | 1.7%
1994 | 15121 | 1.5%
2526 | 14693 | 1.5%
2299 | 14410 | 1.4%
1863 | 14125 | 1.4%
761 | 13835 | 1.4%
2333 | 12672 | 1.3%
1993 | 12141 | 1.2%
2665 | 11862 | 1.2%
2676 | 11816 | 1.2%
1873 | 11574 | 1.2%
2507 | 10896 | 1.1%
2726 | 10166 | 1.0%
2566 | 8980 | 0.9%
Other values (391) | 461122 | 46.1%

Value | Count | Frequency (%)
112 | 5974 | 0.6%
122 | 5712 | 0.6%
153 | 324 | < 0.1%
178 | 6255 | 0.6%
196 | 105 | < 0.1%
394 | 432 | < 0.1%
423 | 23384 | 2.3%
479 | 5014 | 0.5%
544 | 2498 | 0.2%
547 | 3703 | 0.4%

Value | Count | Frequency (%)
2757 | 65 | < 0.1%
2756 | 263 | < 0.1%
2755 | 1287 | 0.1%
2754 | 1523 | 0.2%
2753 | 101 | < 0.1%
2749 | 849 | 0.1%
2748 | 876 | 0.1%
2747 | 2161 | 0.2%
2745 | 108 | < 0.1%
2743 | 123 | < 0.1%

F
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Value | Count | Frequency (%)
0 | 418976 | 41.9%
3 | 337978 | 33.8%
2 | 175623 | 17.6%
1 | 67423 | 6.7%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
0 | 418976 | 41.9%
3 | 337978 | 33.8%
2 | 175623 | 17.6%
1 | 67423 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1000000 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
0 | 418976 | 41.9%
3 | 337978 | 33.8%
2 | 175623 | 17.6%
1 | 67423 | 6.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1000000 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
0 | 418976 | 41.9%
3 | 337978 | 33.8%
2 | 175623 | 17.6%
1 | 67423 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1000000 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
0 | 418976 | 41.9%
3 | 337978 | 33.8%
2 | 175623 | 17.6%
1 | 67423 | 6.7%

G
Real number (ℝ≥0)

Distinct count: 66
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 226.9311
Minimum: 33
Maximum: 1839
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 33
5-th percentile: 35
Q1: 35
median: 39
Q3: 171
95-th percentile: 1063
Maximum: 1839
Range: 1806
Interquartile range (IQR): 136

Descriptive statistics

Standard deviation: 350.4800993
Coefficient of variation (CV): 1.544433968
Kurtosis: 3.857232288
Mean: 226.9311
Median Absolute Deviation (MAD): 4
Skewness: 2.157856948
Sum: 226931100
Variance: 122836.3
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
35 | 300039 | 30.0%
39 | 218697 | 21.9%
167 | 77933 | 7.8%
161 | 39260 | 3.9%
47 | 36000 | 3.6%
1327 | 26959 | 2.7%
297 | 25272 | 2.5%
163 | 22969 | 2.3%
175 | 20110 | 2.0%
679 | 18311 | 1.8%
935 | 17501 | 1.8%
687 | 13833 | 1.4%
41 | 12812 | 1.3%
1063 | 12776 | 1.3%
33 | 11757 | 1.2%
431 | 10611 | 1.1%
803 | 10166 | 1.0%
1319 | 9606 | 1.0%
419 | 8107 | 0.8%
303 | 7988 | 0.8%
171 | 7563 | 0.8%
169 | 7315 | 0.7%
299 | 7312 | 0.7%
427 | 6748 | 0.7%
34 | 6684 | 0.7%
Other values (41) | 63671 | 6.4%

Value | Count | Frequency (%)
33 | 11757 | 1.2%
34 | 6684 | 0.7%
35 | 300039 | 30.0%
38 | 4510 | 0.5%
39 | 218697 | 21.9%
41 | 12812 | 1.3%
43 | 5277 | 0.5%
45 | 81 | < 0.1%
47 | 36000 | 3.6%
161 | 39260 | 3.9%

Value | Count | Frequency (%)
1839 | 276 | < 0.1%
1835 | 467 | < 0.1%
1831 | 595 | 0.1%
1711 | 2336 | 0.2%
1583 | 92 | < 0.1%
1575 | 496 | < 0.1%
1451 | 3263 | 0.3%
1447 | 3 | < 0.1%
1327 | 26959 | 2.7%
1319 | 9606 | 1.0%

H
Real number (ℝ)

Distinct count: 164
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53279.713029
Minimum: -1
Maximum: 100248
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
median: 100049
Q3: 100093
95-th percentile: 100190
Maximum: 100248
Range: 100249
Interquartile range (IQR): 100094

Descriptive statistics

Standard deviation: 49952.76706
Coefficient of variation (CV): 0.9375569842
Kurtosis: -1.983339436
Mean: 53279.71303
Median Absolute Deviation (MAD): 144
Skewness: -0.129082552
Sum: 5.327971303e+10
Variance: 2495278937
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
-1 | 467796 | 46.8%
100084 | 60157 | 6.0%
100148 | 44125 | 4.4%
100111 | 42740 | 4.3%
100077 | 39227 | 3.9%
100075 | 38613 | 3.9%
100081 | 33144 | 3.3%
100083 | 26559 | 2.7%
100156 | 25462 | 2.5%
100193 | 17446 | 1.7%
100176 | 16090 | 1.6%
100074 | 14560 | 1.5%
100079 | 14011 | 1.4%
100189 | 11579 | 1.2%
100076 | 11368 | 1.1%
100192 | 5963 | 0.6%
100190 | 5723 | 0.6%
100191 | 5541 | 0.6%
100188 | 5413 | 0.5%
100013 | 5000 | 0.5%
100031 | 4558 | 0.5%
100155 | 3885 | 0.4%
100194 | 3668 | 0.4%
100181 | 3608 | 0.4%
100000 | 3523 | 0.4%
Other values (139) | 90241 | 9.0%

Value | Count | Frequency (%)
-1 | 467796 | 46.8%
100000 | 3523 | 0.4%
100001 | 199 | < 0.1%
100002 | 180 | < 0.1%
100003 | 2859 | 0.3%
100004 | 2128 | 0.2%
100005 | 1707 | 0.2%
100006 | 1 | < 0.1%
100010 | 30 | < 0.1%
100012 | 357 | < 0.1%

Value | Count | Frequency (%)
100248 | 345 | < 0.1%
100244 | 24 | < 0.1%
100241 | 596 | 0.1%
100233 | 2930 | 0.3%
100229 | 51 | < 0.1%
100228 | 1596 | 0.2%
100225 | 174 | < 0.1%
100224 | 45 | < 0.1%
100221 | 1841 | 0.2%
100217 | 1104 | 0.1%

I
Real number (ℝ≥0)

Distinct count: 60
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.386604
Minimum: 1
Maximum: 255
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 23
Q1: 23
median: 61
Q3: 101
95-th percentile: 221
Maximum: 255
Range: 254
Interquartile range (IQR): 78

Descriptive statistics

Standard deviation: 70.30124804
Coefficient of variation (CV): 0.8430760418
Kurtosis: -0.2603528011
Mean: 83.386604
Median Absolute Deviation (MAD): 38
Skewness: 1.092674088
Sum: 83386604
Variance: 4942.265476
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
23 | 219749 | 22.0%
221 | 125152 | 12.5%
79 | 113850 | 11.4%
48 | 53691 | 5.4%
71 | 52219 | 5.2%
61 | 51143 | 5.1%
157 | 45519 | 4.6%
32 | 43948 | 4.4%
33 | 37083 | 3.7%
52 | 29529 | 3.0%
42 | 25210 | 2.5%
51 | 21276 | 2.1%
15 | 18848 | 1.9%
212 | 16394 | 1.6%
43 | 14632 | 1.5%
117 | 10236 | 1.0%
229 | 10166 | 1.0%
13 | 9554 | 1.0%
16 | 8726 | 0.9%
156 | 8266 | 0.8%
68 | 8051 | 0.8%
159 | 7278 | 0.7%
95 | 6877 | 0.7%
46 | 5767 | 0.6%
246 | 4908 | 0.5%
Other values (35) | 51928 | 5.2%

Value | Count | Frequency (%)
1 | 81 | < 0.1%
13 | 9554 | 1.0%
15 | 18848 | 1.9%
16 | 8726 | 0.9%
17 | 4115 | 0.4%
20 | 324 | < 0.1%
23 | 219749 | 22.0%
32 | 43948 | 4.4%
33 | 37083 | 3.7%
35 | 1168 | 0.1%

Value | Count | Frequency (%)
255 | 108 | < 0.1%
253 | 1921 | 0.2%
251 | 523 | 0.1%
246 | 4908 | 0.5%
229 | 10166 | 1.0%
221 | 125152 | 12.5%
219 | 37 | < 0.1%
212 | 16394 | 1.6%
204 | 2301 | 0.2%
195 | 96 | < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
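
As an illustration of the definition above, here is a minimal pure-Python sketch of r as covariance divided by the product of standard deviations. The function name `pearson_r` and the sample data are illustrative and not part of the report; the 1/(n - 1) factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their std devs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations; the common 1/(n - 1)
    # factor cancels between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


x = [1, 2, 4, 7]
y = [3, 1, 5, 6]
r_original = pearson_r(x, y)
# Shifting and (positively) rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 2 for b in y])
print(r_original, r_rescaled)
```

The second call demonstrates the location and scale invariance mentioned above; note that rescaling by a negative factor would flip the sign of r.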

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
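
A minimal sketch of that recipe: convert each variable to ranks, then apply the Pearson formula to the ranks. Assigning tied values the average of their ranks is a common convention assumed here; the report does not say how ties are handled. The names `average_ranks` and `spearman_rho` are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's r, used here on the rank variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print(spearman_rho(x, y), pearson_r(x, y))
```

For the cubic relationship in the example, the ranks of x and y coincide, so ρ reaches its maximum while Pearson's r stays below 1.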

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
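
The pair-counting definition can be sketched directly. The version below is the plain form described above (often called tau-a), in which tied pairs count as neither concordant nor discordant; libraries such as SciPy default to the tau-b variant, which adjusts the denominator for ties, so results can differ when ties are present. The function name `kendall_tau_a` is illustrative, and the quadratic loop is only practical for small inputs.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One of the six pairs is discordant, so tau = (5 - 1) / 6.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```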

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. See the Phik documentation for details.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
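
For orientation, the following is a minimal sketch of a bias-corrected Cramér's V along the lines of Bergsma's 2013 proposal: both the φ² term and the table dimensions are shrunk by terms of order 1/(n - 1) before taking the ratio. It is written from the published formula and is not taken from the report's own implementation; the function name `cramers_v_corrected` and the toy data are illustrative.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(r_corr - 1, k_corr - 1)
    if denominator <= 0:
        return 0.0  # degenerate table, e.g. a variable with one category
    return math.sqrt(phi2_corr / denominator)


labels = list("abc") * 10
v_perfect = cramers_v_corrected(labels, labels)  # identical variables
v_independent = cramers_v_corrected(list("aabb"), list("cdcd"))  # balanced 2x2
print(v_perfect, v_independent)
```

The two calls cover the endpoints described above: identical variables give perfect association, and a perfectly balanced 2x2 table gives independence.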

Missing values

Sample

First rows

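  | df_index | like | hour | sid | sdomain | scat | aid | adomain | acat | did | dip | dmodel | dtype | dconn | pos | A | B | C | D | E | F | G | H | I
0 | 5171432 | 1 | 19122206 | 85f751fd | c4e18dd6 | 50e219e0 | fb7c70a3 | d9b5648e | 0f2161f8 | 337bf809 | 56d6c8a9 | aad45b01 | 1 | 0 | 0 | 1005 | 21747 | 320 | 50 | 2504 | 3 | 41 | 100160 | 111
1 | 24755502 | 0 | 19122707 | d6137915 | bb1ef334 | f028772b | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | fe02cd8a | 8a4875bd | 1 | 0 | 0 | 1005 | 20213 | 320 | 50 | 2316 | 0 | 167 | 100081 | 16
2 | 22223641 | 0 | 19122613 | e59ef3fc | 0a4015b2 | 335d28a8 | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | fa56a0ec | 853073be | 1 | 0 | 1 | 1005 | 19771 | 320 | 50 | 2227 | 0 | 935 | 100079 | 48
3 | 32458364 | 0 | 19122900 | 85f751fd | c4e18dd6 | 50e219e0 | e2fcccd2 | 5c5a694b | 0f2161f8 | a3215618 | 9a10a7eb | be74e6fe | 1 | 0 | 0 | 1005 | 4687 | 320 | 50 | 423 | 2 | 39 | 100148 | 32
4 | 6027438 | 1 | 19122209 | 4bf5bbe2 | 6b560cc1 | 28905ebd | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | e55e55a1 | 836d2439 | 1 | 0 | 0 | 1005 | 21789 | 320 | 50 | 2512 | 2 | 303 | -1 | 52
5 | 10310046 | 0 | 19122305 | b7e9786d | b12b9f85 | f028772b | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | 9678ccab | a0f5f879 | 1 | 0 | 1 | 1005 | 16208 | 320 | 50 | 1800 | 3 | 167 | 100077 | 23
6 | 36769183 | 0 | 19123004 | 85f751fd | c4e18dd6 | 50e219e0 | 66f5e02e | 6f7ca2ba | 0f2161f8 | 79bc0e4f | c72bfab0 | 2891f384 | 1 | 0 | 0 | 1005 | 23804 | 320 | 50 | 2726 | 3 | 803 | 100091 | 229
7 | 18757124 | 0 | 19122515 | 6c5b482c | 7687a86e | 3e814130 | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | c782a684 | 84ebbcd4 | 1 | 0 | 0 | 1005 | 17654 | 300 | 250 | 1994 | 2 | 39 | -1 | 33
8 | 18219411 | 0 | 19122512 | 1fbe01fe | f3845767 | 28905ebd | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | 6f669b29 | cdf6ea96 | 1 | 0 | 0 | 1005 | 15708 | 320 | 50 | 1722 | 0 | 35 | -1 | 79
9 | 40086614 | 0 | 19123020 | 85f751fd | c4e18dd6 | 50e219e0 | f0d41ff1 | 2347f47a | 0f2161f8 | a99f214a | baa7e549 | 2de871e6 | 1 | 0 | 0 | 1005 | 24040 | 320 | 50 | 2756 | 3 | 299 | 100112 | 61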

Last rows

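  | df_index | like | hour | sid | sdomain | scat | aid | adomain | acat | did | dip | dmodel | dtype | dconn | pos | A | B | C | D | E | F | G | H | I
999990 | 11489892 | 0 | 19122311 | b7e9786d | b12b9f85 | f028772b | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | b3bb4959 | 5ec45883 | 1 | 0 | 1 | 1005 | 22120 | 320 | 50 | 1702 | 0 | 1059 | 100079 | 110
999991 | 8001580 | 0 | 19122214 | a7853007 | 7e091613 | f028772b | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | 009a7861 | 3bb1ddd7 | 1 | 0 | 1 | 1005 | 9478 | 320 | 50 | 906 | 3 | 1451 | 100156 | 61
999992 | 23753652 | 0 | 19122622 | 85f751fd | c4e18dd6 | 50e219e0 | ce183bbd | ae637522 | cef3e649 | a99f214a | d8b9fb64 | 36b67a2a | 1 | 0 | 0 | 1005 | 22516 | 320 | 50 | 2597 | 1 | 167 | 100005 | 71
999993 | 20843974 | 0 | 19122606 | 5114c672 | 3f2f3819 | 3e814130 | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | e2114742 | 4ea23a13 | 1 | 0 | 1 | 1005 | 19771 | 320 | 50 | 2227 | 0 | 679 | -1 | 48
999994 | 11126076 | 0 | 19122309 | 85f751fd | c4e18dd6 | 50e219e0 | 54c5d545 | 2347f47a | 0f2161f8 | a99f214a | b6d940b0 | d787e91b | 1 | 0 | 0 | 1005 | 15702 | 320 | 50 | 1722 | 0 | 35 | -1 | 79
999995 | 14043529 | 0 | 19122405 | e151e245 | 7e091613 | f028772b | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | 30e8f0b7 | 1f0bc64f | 1 | 0 | 1 | 1005 | 21679 | 320 | 50 | 2495 | 2 | 167 | 100173 | 23
999996 | 21948821 | 0 | 19122612 | 85f751fd | c4e18dd6 | 50e219e0 | 92f5800b | ae637522 | 0f2161f8 | a99f214a | 777a32ec | 496515fa | 1 | 3 | 0 | 1005 | 21189 | 320 | 50 | 2424 | 1 | 161 | 100193 | 71
999997 | 533594 | 0 | 19122103 | 1fbe01fe | f3845767 | 28905ebd | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | d8a5f6c9 | 711ee120 | 1 | 0 | 0 | 1005 | 15706 | 320 | 50 | 1722 | 0 | 35 | 100084 | 79
999998 | 18990869 | 0 | 19122516 | 1fbe01fe | f3845767 | 28905ebd | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | ed10de3f | 8a4875bd | 1 | 0 | 0 | 1005 | 15701 | 320 | 50 | 1722 | 0 | 35 | 100084 | 79
999999 | 24579642 | 0 | 19122706 | 1fbe01fe | f3845767 | 28905ebd | ecad2386 | 7801e8d9 | 07d7df22 | a99f214a | dd570a33 | a8d2c4cf | 1 | 0 | 0 | 1005 | 22676 | 320 | 50 | 2616 | 0 | 35 | 100083 | 51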